Time slicing methods in dynamic networks strongly influence the accuracy of community evolution analysis. Because communities vary nonlinearly with time and network topology, neither the existing uniform time slicing method nor the nonuniform time slicing method based on network topology variance captures community evolution events satisfactorily. Therefore, a nonuniform time slicing method based on the prediction of community variance was proposed, where the community variance is quantified as the difference between the community modularity that the updated network is expected to achieve and the community modularity obtained by directly applying the community detection results of the network before the change. Firstly, a prediction model of community modularity was established on the basis of time series analysis. Secondly, the established model was used to predict the expected community modularity of the updated network, which gives the predicted value of the community variance. Finally, once the predicted value exceeded a preset threshold, a new time slice was generated. Experimental results on two real network datasets show that, compared with the traditional uniform time slicing method and the nonuniform time slicing method based on network topology variance, on the dynamic network dataset Arxiv HEP-PH the proposed method identifies community disappearance events 1.10 days and 1.30 days earlier, respectively, identifies community forming events 8.34 days and 3.34 days earlier, respectively, and increases the total number of identified community shrinking and growing events by 10 and 1, respectively. On the Sx-MathOverflow dataset, the proposed method identifies community disappearance events 3.30 days and 1.80 days earlier, respectively, identifies community forming events 6.41 days and 2.97 days earlier, respectively, and increases the total number of identified community shrinking and growing events by 15 and 7, respectively.
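The slicing rule described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the time series model, so `predict_next` uses a plain linear trend over recent modularity values as a stand-in, and all function names, the `window` parameter, and the choice to treat nodes missing from the old partition as singleton communities are assumptions made here for illustration.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity of an undirected, unweighted edge list under a fixed partition.

    Nodes missing from `community_of` are treated as singleton communities
    (an assumption of this sketch, not something stated in the abstract).
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal = defaultdict(int)  # edges with both endpoints in the same community
    degree = defaultdict(int)    # sum of node degrees per community
    for u, v in edges:
        cu = community_of.get(u, ("new", u))
        cv = community_of.get(v, ("new", v))
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


def predict_next(history, window=3):
    """Extrapolate the next modularity value with a linear trend.

    Stand-in for the time series model; `history` must be non-empty.
    """
    recent = history[-window:]
    if len(recent) < 2:
        return recent[-1]
    slope = (recent[-1] - recent[0]) / (len(recent) - 1)
    return recent[-1] + slope


def should_slice(history, updated_edges, old_partition, threshold):
    """Return (start_new_slice, predicted_variance) for the updated network."""
    expected = predict_next(history)                         # modularity expected after update
    carried_over = modularity(updated_edges, old_partition)  # old partition applied as-is
    variance = expected - carried_over
    return variance > threshold, variance
```

In this sketch, `history` holds the modularity values of earlier slices, `updated_edges` is the edge list of the network after the change, and `old_partition` maps each node to the community found before the change. A new slice is started only when the old partition fits the updated network noticeably worse than the predicted modularity suggests it should.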
Spam recognition is one of the main tasks in natural language processing. The traditional methods are based on text features or word frequency, and their recognition accuracy depends mainly on the presence or absence of specific keywords. When the spam contains no keywords, or the keywords are recognized incorrectly, the traditional methods perform poorly. Therefore, neural network-based methods were proposed, and recognition training and testing were conducted on complex spam. Spam that cannot be recognized by the traditional methods was collected, and the same amount of normal information was randomly selected, from spam message, advertisement and spam email datasets to form three new datasets without duplicate data. Three models based on convolutional neural networks and recurrent neural networks were proposed and tested on the three new datasets for spam recognition. The experimental results show that the neural network-based models learned better semantic features from the text and achieved accuracies of more than 98% on all three datasets, significantly higher than those of traditional methods such as Naive Bayes (NB), Random Forest (RF) and Support Vector Machine (SVM). The experimental results also show that different neural networks suit text classification at different lengths: the models composed of recurrent neural networks are good at recognizing sentence-length text, the models composed of convolutional neural networks are good at recognizing paragraph-length text, and the models composed of both kinds of network are good at recognizing chapter-length text.
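The keyword dependence of the traditional methods can be illustrated with a small word-frequency Naive Bayes classifier. This is only a sketch under stated assumptions: the function names are hypothetical, the training messages are made-up toy strings rather than data from the paper, and the model is a generic multinomial Naive Bayes with add-one smoothing, not the specific baseline used in the experiments.

```python
import math
from collections import Counter


def train_nb(samples):
    """Fit a multinomial Naive Bayes model on (text, label) pairs."""
    word_counts = {}        # label -> Counter of word frequencies
    doc_counts = Counter()  # label -> number of training texts
    vocab = set()
    for text, label in samples:
        words = text.lower().split()
        word_counts.setdefault(label, Counter()).update(words)
        doc_counts[label] += 1
        vocab.update(words)
    return word_counts, doc_counts, vocab


def log_scores(model, text):
    """Log prior plus log likelihood of the known words, per label."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for word in text.lower().split():
            if word not in vocab:
                continue  # words never seen in training contribute no evidence
            # Add-one (Laplace) smoothing over the training vocabulary.
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return scores


def predict(model, text):
    """Return the label with the highest score."""
    scores = log_scores(model, text)
    return max(scores, key=scores.get)


# Toy training set, invented for this sketch only.
TOY_TRAINING_SET = [
    ("win free prize now", "spam"),
    ("free cash offer now", "spam"),
    ("meeting at noon tomorrow", "ham"),
    ("see you at lunch tomorrow", "ham"),
]
```

With this toy model, a message that uses the learned keywords, such as `"free prize offer"`, is scored as spam. Once the keywords are obfuscated, as in `"c1aim your pr1ze at lunch tomorrow"`, the only words the model knows are ordinary ones, so the message is scored as normal. This is the failure mode that motivates learning semantic features rather than relying on keyword counts.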